reachability relation
Supplementary Materials for Nimble: Lightweight and Parallel GPU T ask Scheduling for Deep Learning Appendix A Proofs on the Stream Assignment Algorithm of Nimble
In this section, we provide detailed proofs on the theorems presented in Section 4.2. We assume that the computation graph of a neural network is given. Here we define important concepts and terminologies used in the following proofs. F or any (u,v) E, f ( u) = f (v) or there exists a path P E from u to v such that P Λ null= . Prior to the proof of Theorem 1-2, we describe and prove Lemma 1 and Lemma 2. Lemma 1. We will prove by contradiction.
Understanding and Enhancing the Planning Capability of Language Models via Multi-Token Prediction
Zhong, Qimin, Liao, Hao, Wang, Siwei, Zhou, Mingyang, Wu, Xiaoqun, Mao, Rui, Chen, Wei
Large Language Models (LLMs) have achieved impressive performance across diverse tasks but continue to struggle with learning transitive relations, a cornerstone for complex planning. To address this issue, we investigate the Multi-Token Prediction (MTP) paradigm and its impact to transitive relation learning. We theoretically analyze the MTP paradigm using a Transformer architecture composed of a shared output head and a transfer layer. Our analysis reveals that the transfer layer gradually learns the multi-step adjacency information, which in turn enables the backbone model to capture unobserved transitive reachability relations beyond those directly present in the training data, albeit with some inevitable noise in adjacency estimation. Building on this foundation, we propose two strategies to enhance the transfer layer and overall learning quality: Next-Token Injection (NTI) and a Transformer-based transfer layer. Our experiments on both synthetic graphs and the Blocksworld planning benchmark validate our theoretical findings and demonstrate that the improvements significantly enhance the model's path-planning capability. These findings deepen our understanding of how Transformers with MTP learn in complex planning tasks, and provide practical strategies to overcome the transitivity bottleneck, paving the way toward structurally aware and general-purpose planning models.
Reconciling Spatial and Temporal Abstractions for Goal Representation
Zadem, Mehdi, Mover, Sergio, Nguyen, Sao Mai
Goal representation affects the performance of Hierarchical Reinforcement Learning (HRL) algorithms by decomposing the complex learning problem into easier subtasks. Recent studies show that representations that preserve temporally abstract environment dynamics are successful in solving difficult problems and provide theoretical guarantees for optimality. These methods however cannot scale to tasks where environment dynamics increase in complexity i.e. the temporally abstract transition relations depend on larger number of variables. On the other hand, other efforts have tried to use spatial abstraction to mitigate the previous issues. Their limitations include scalability to high dimensional environments and dependency on prior knowledge. In this paper, we propose a novel three-layer HRL algorithm that introduces, at different levels of the hierarchy, both a spatial and a temporal goal abstraction. We provide a theoretical study of the regret bounds of the learned policies. We evaluate the approach on complex continuous control tasks, demonstrating the effectiveness of spatial and temporal abstractions learned by this approach.
Uras
Search using subgoal graphs is a recent preprocessing-based path-planning algorithm that can find shortest paths on 8-neighbor grids several orders of magnitude faster than A*, while requiring little preprocessing time and memory overhead. In this paper, we first generalize the ideas behind subgoal graphs to a framework that can be specialized to different types of environments (represented as weighted directed graphs) through the choice of a reachability relation. Intuitively, a reachability relation identifies pairs of vertices for which a shortest path can be found quickly. A subgoal graph can then be constructed as an overlay graph that is guaranteed to have edges only between vertices that satisfy the reachability relation, which allows one to find shortest paths on the original graph quickly. In the context of this general framework, subgoal graphs on grids use freespace-reachability (originally called h-reachability) as the reachability relation, which holds for pairs of vertices if and only if their distance on the grid with blocked cells is equal to their distance on the grid without blocked cells (freespace assumption). We apply this framework to state lattices by using variants of freespace-reachability as the reachability relation. We provide preliminary results on (x,y,theta)-state lattices, which shows that subgoal graphs can be used to speed up path planning on state lattices as well, although the speed-up is not as significant as it is on grids.
Feasibility Study: Subgoal Graphs on State Lattices
Uras, Tansel (University of Southern California) | Koenig, Sven (University of Southern California)
Search using subgoal graphs is a recent preprocessing-based path-planning algorithm that can find shortest paths on 8-neighbor grids several orders of magnitude faster than A*, while requiring little preprocessing time and memory overhead. In this paper, we first generalize the ideas behind subgoal graphs to a framework that can be specialized to different types of environments (represented as weighted directed graphs) through the choice of a reachability relation. Intuitively, a reachability relation identifies pairs of vertices for which a shortest path can be found quickly. A subgoal graph can then be constructed as an overlay graph that is guaranteed to have edges only between vertices that satisfy the reachability relation, which allows one to find shortest paths on the original graph quickly. In the context of this general framework, subgoal graphs on grids use freespace-reachability (originally called h-reachability) as the reachability relation, which holds for pairs of vertices if and only if their distance on the grid with blocked cells is equal to their distance on the grid without blocked cells (freespace assumption). We apply this framework to state lattices by using variants of freespace-reachability as the reachability relation. We provide preliminary results on (x,y,theta)-state lattices, which shows that subgoal graphs can be used to speed up path planning on state lattices as well, although the speed-up is not as significant as it is on grids.